Correlation Coefficient Based Average Textual Similarity Model for Information Retrieval System in Wide Area Networks

Authors

  • Jaswinder Singh
  • Parvinder Singh
  • Yogesh Chaba
Abstract

In wide area networks, retrieving relevant text is a challenging task for information retrieval because most information requests are text based. This paper focuses on similarity measurement, performance evaluation, and the design of information retrieval techniques using four similarity functions: Jaccard, Cosine, Dice, and Overlap. These similarity functions are evaluated on the similarity between the documents retrieved by a search engine for the entered text, using the vector space model. The correlation coefficient is applied to evaluate the performance of the similarity functions. All possible combinations of the similarity functions are explored, and a textual similarity model is proposed for information retrieval systems in wide area networks.
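As a rough illustration of the four similarity functions named in the abstract, the following Python sketch computes them over simple term-frequency vectors and includes a Pearson correlation helper. The abstract does not give the formulas, term weighting, or tokenization the authors used, so everything here is an assumption: the commonly cited vector-space forms of each measure, whitespace tokenization, raw term counts, and hypothetical function names such as `tf_vector`.

```python
import math
from collections import Counter


def tf_vector(text):
    """Assumed representation: lowercase, whitespace-split term counts."""
    return Counter(text.lower().split())


def dot(a, b):
    """Inner product over the terms the two vectors share."""
    return sum(a[t] * b[t] for t in a.keys() & b.keys())


def sq_norm(a):
    """Squared Euclidean norm of a term vector."""
    return sum(v * v for v in a.values())


def cosine(a, b):
    denom = math.sqrt(sq_norm(a)) * math.sqrt(sq_norm(b))
    return dot(a, b) / denom if denom else 0.0


def jaccard(a, b):
    # Vector (Tanimoto) form of the Jaccard coefficient.
    d = dot(a, b)
    denom = sq_norm(a) + sq_norm(b) - d
    return d / denom if denom else 0.0


def dice(a, b):
    denom = sq_norm(a) + sq_norm(b)
    return 2 * dot(a, b) / denom if denom else 0.0


def overlap(a, b):
    denom = min(sq_norm(a), sq_norm(b))
    return dot(a, b) / denom if denom else 0.0


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


if __name__ == "__main__":
    # Hypothetical query and documents, for illustration only.
    query = tf_vector("textual similarity for information retrieval")
    docs = [
        tf_vector("information retrieval in wide area networks"),
        tf_vector("similarity functions for textual information retrieval"),
        tf_vector("load frequency control with bayesian networks"),
    ]
    scores = {f.__name__: [f(query, d) for d in docs]
              for f in (jaccard, cosine, dice, overlap)}
    for name, vals in scores.items():
        print(name, [round(v, 3) for v in vals])
    # Correlation between the score lists of two functions.
    print("jaccard vs cosine:", round(pearson(scores["jaccard"], scores["cosine"]), 3))
```

One way to read the abstract is that the score lists produced by pairs of functions are compared through the correlation coefficient before being combined; how the authors actually average or combine them is not stated here, so the final `pearson` call is only one plausible use.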


Similar resources

A Study of Similarity Functions Used in Textual Information Retrieval in Wide Area Networks

The World Wide Web is a rich source of information. It continues to expand in size and complexity with the increasing use of the internet and social media, but retrieving relevant documents on the Web is becoming a challenge. This paper discusses the goals, challenges, and importance of similarity functions in information retrieval in wide area networks. This paper discusses t...


A New Similarity Measure Based on Item Proximity and Closeness for Collaborative Filtering Recommendation

Recommender systems utilize information retrieval and machine learning techniques for filtering information and can predict whether a user would like an unseen item. User similarity measurement plays an important role in collaborative filtering based recommender systems. In order to improve the accuracy of traditional user-based collaborative filtering techniques under the new-user cold-start problem a...


Efficient Agent-Based Dissemination of Textual Information

We study the problem of efficient dissemination of textual information over wide-area networks. Our dissemination architecture utilises middle-agents and sophisticated matching algorithms. The data model and query language are based on the well-known Boolean model from Information Retrieval. The main focus of this paper is the problem of matching incoming documents with submitted user profiles. ...


Load-Frequency Control: a GA based Bayesian Networks Multi-agent System

Bayesian Networks (BN) provide a robust probabilistic method of reasoning under uncertainty. They have been successfully applied in a variety of real-world tasks, but they have received little attention in the area of load-frequency control (LFC). In practice, LFC systems use proportional-integral controllers. However, since these controllers are designed using a linear model, the nonlinearities...


A Combined Matching Function based Evolutionary Approach for development of Adaptive Information Retrieval System

The growth in the volume of the Web and other textual repositories has made the information retrieval task difficult, costly, and in many cases very complex for the end user. In this context, search engines became valuable tools to help users find content relevant to their information needs. However, finding relevant information based on a user's need is still a challenge. Naturally research on informat...




Publication date: 2015